{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Getting Started with BentoML\n",
    "\n",
    "[BentoML](http://bentoml.ai) is an open-source framework for machine learning **model serving**, aiming to **bridge the gap between Data Science and DevOps**.\n",
    "\n",
    "Data Scientists can easily package their models trained with any ML framework using BentoMl and reproduce the model for serving in production. BentoML helps with managing packaged models in the BentoML format, and allows DevOps to deploy them as online API serving endpoints or offline batch inference jobs, on any cloud platform.\n",
    "\n",
    "This getting started guide demonstrates how to use BentoML to serve a sklearn modeld via a REST API server, and then containerize the model server for production deployment.\n",
    "\n",
    "![Impression](https://www.google-analytics.com/collect?v=1&tid=UA-112879361-3&cid=555&t=event&ec=guides&ea=bentoml-quick-start-guide&dt=bentoml-quick-start-guide)\n",
    "\n",
    "BentoML requires python 3.6 or above, install dependencies via `pip`:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Install PyPI packages required in this guide, including BentoML\n",
    "!pip install -q --pre bentoml  # install preview version of BentoML for this guide\n",
    "!pip install -q 'scikit-learn>=0.23.2' 'pandas>=1.1.1'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before starting, let's prepare a trained model for serving with BentoML. Train a classifier model on the [Iris data set](https://en.wikipedia.org/wiki/Iris_flower_data_set):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "SVC()"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn import svm\n",
    "from sklearn import datasets\n",
    "\n",
    "# Load training data\n",
    "iris = datasets.load_iris()\n",
    "X, y = iris.data, iris.target\n",
    "\n",
    "# Model Training\n",
    "clf = svm.SVC(gamma='scale')\n",
    "clf.fit(X, y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Create a Prediction Service with BentoML\n",
    "\n",
    "Model serving with BentoML comes after a model is trained. The first step is creating a\n",
    "prediction service class, which defines the models required and the inference APIs which\n",
    "contains the serving logic. Here is a minimal prediction service created for serving\n",
    "the iris classifier model trained above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Overwriting iris_classifier.py\n"
     ]
    }
   ],
   "source": [
    "%%writefile iris_classifier.py\n",
    "import pandas as pd\n",
    "\n",
    "from bentoml import env, artifacts, api, BentoService\n",
    "from bentoml.adapters import DataframeInput\n",
    "from bentoml.frameworks.sklearn import SklearnModelArtifact\n",
    "\n",
    "@env(infer_pip_packages=True)\n",
    "@artifacts([SklearnModelArtifact('model')])\n",
    "class IrisClassifier(BentoService):\n",
    "    \"\"\"\n",
    "    A minimum prediction service exposing a Scikit-learn model\n",
    "    \"\"\"\n",
    "\n",
    "    @api(input=DataframeInput(), batch=True)\n",
    "    def predict(self, df: pd.DataFrame):\n",
    "        \"\"\"\n",
    "        An inference API named `predict` with Dataframe input adapter, which codifies\n",
    "        how HTTP requests or CSV files are converted to a pandas Dataframe object as the\n",
    "        inference API function input\n",
    "        \"\"\"\n",
    "        return self.artifacts.model.predict(df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This code defines a prediction service that packages a scikit-learn model and provides\n",
    "an inference API that expects a `pandas.Dataframe` object as its input. BentoML also supports other API input \n",
    "data types including `JsonInput`, `ImageInput`, `FileInput` and \n",
    "[more](https://docs.bentoml.org/en/latest/api/adapters.html).\n",
    "\n",
    "\n",
    "In BentoML, **all inference APIs are suppose to accept a list of inputs and return a \n",
    "list of results**. In the case of `DataframeInput`, each row of the dataframe is mapping\n",
    "to one prediction request received from the client. BentoML will convert HTTP JSON \n",
    "requests into :code:`pandas.DataFrame` object before passing it to the user-defined \n",
    "inference API function.\n",
    " \n",
    "This design allows BentoML to group API requests into small batches while serving online\n",
    "traffic. Comparing to a regular flask or FastAPI based model server, this can increases\n",
    "the overall throughput of the API server by 10-100x depending on the workload.\n",
    "\n",
    "The following code packages the trained model with the prediction service class\n",
    "`IrisClassifier` defined above, and then saves the IrisClassifier instance to disk \n",
    "in the BentoML format for distribution and deployment:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<iris_classifier.IrisClassifier at 0x7fbc9fb8d1d0>"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# import the IrisClassifier class defined above\n",
    "from iris_classifier import IrisClassifier\n",
    "\n",
    "# Create a iris classifier service instance\n",
    "iris_classifier_service = IrisClassifier()\n",
    "\n",
    "# Pack the newly trained model artifact\n",
    "iris_classifier_service.pack('model', clf)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>0</th>\n",
       "      <th>1</th>\n",
       "      <th>2</th>\n",
       "      <th>3</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>53</th>\n",
       "      <td>5.5</td>\n",
       "      <td>2.3</td>\n",
       "      <td>4.0</td>\n",
       "      <td>1.3</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>32</th>\n",
       "      <td>5.2</td>\n",
       "      <td>4.1</td>\n",
       "      <td>1.5</td>\n",
       "      <td>0.1</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8</th>\n",
       "      <td>4.4</td>\n",
       "      <td>2.9</td>\n",
       "      <td>1.4</td>\n",
       "      <td>0.2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>44</th>\n",
       "      <td>5.1</td>\n",
       "      <td>3.8</td>\n",
       "      <td>1.9</td>\n",
       "      <td>0.4</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>124</th>\n",
       "      <td>6.7</td>\n",
       "      <td>3.3</td>\n",
       "      <td>5.7</td>\n",
       "      <td>2.1</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       0    1    2    3\n",
       "53   5.5  2.3  4.0  1.3\n",
       "32   5.2  4.1  1.5  0.1\n",
       "8    4.4  2.9  1.4  0.2\n",
       "44   5.1  3.8  1.9  0.4\n",
       "124  6.7  3.3  5.7  2.1"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Prepare input data for testing the prediction service\n",
    "import pandas as pd\n",
    "test_input_df = pd.DataFrame(X).sample(n=5)\n",
    "test_input_df.to_csv(\"./test_input.csv\", index=False)\n",
    "test_input_df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 0, 0, 0, 2])"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Test the service's inference API python interface\n",
    "iris_classifier_service.predict(test_input_df)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[2021-03-19 02:35:51,955] INFO - BentoService bundle 'IrisClassifier:20210319023551_84AAF6' created at: /var/folders/7p/y_934t3s4yg8fx595vr28gym0000gn/T/tmpq19fw4uu\n",
      "[2021-03-19 02:35:51,961] INFO - ======= starting dev server on port: None =======\n",
      "[2021-03-19 02:35:53,093] DEBUG - Loaded logging configuration from default configuration and environment variables.\n",
      "[2021-03-19 02:35:53,094] DEBUG - Setting debug mode: ON for current session\n",
      "[2021-03-19 02:35:53,094] INFO - Starting BentoML API server in development mode..\n",
      "[2021-03-19 02:35:53,675] DEBUG - Using BentoML default docker base image 'bentoml/model-server:0.11.0-py37'\n",
      " * Serving Flask app \"IrisClassifier\" (lazy loading)\n",
      " * Environment: development\n",
      " * Debug mode: on\n",
      "[2021-03-19 02:36:03,098] INFO - {'service_name': 'IrisClassifier', 'service_version': '20210319023551_84AAF6', 'api': 'predict', 'task': {'data': '[[5.5, 2.3, 4.0, 1.3], [5.2, 4.1, 1.5, 0.1], [4.4, 2.9, 1.4, 0.2], [5.1, 3.8, 1.9, 0.4], [6.7, 3.3, 5.7, 2.1]]', 'task_id': '20b0dd53-d63b-47ed-8fc2-9d2303a1f7c7', 'batch': 5, 'http_headers': (('Host', '127.0.0.1:5000'), ('User-Agent', 'python-requests/2.22.0'), ('Accept-Encoding', 'gzip, deflate'), ('Accept', '*/*'), ('Connection', 'keep-alive'), ('Content-Length', '110'), ('Content-Type', 'application/json'))}, 'result': {'data': '[1, 0, 0, 0, 2]', 'http_status': 200, 'http_headers': (('Content-Type', 'application/json'),)}, 'request_id': '20b0dd53-d63b-47ed-8fc2-9d2303a1f7c7'}\n"
     ]
    }
   ],
   "source": [
    "# Start a dev model server to test out everything\n",
    "iris_classifier_service.start_dev_server()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1, 0, 0, 0, 2]\n"
     ]
    }
   ],
   "source": [
    "import requests\n",
    "response = requests.post(\n",
    "    \"http://127.0.0.1:5000/predict\",\n",
    "    json=test_input_df.values.tolist()\n",
    ")\n",
    "print(response.text)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Stop the dev model server\n",
    "iris_classifier_service.stop_dev_server()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[2021-03-19 02:36:06,764] INFO - BentoService bundle 'IrisClassifier:20210319023551_84AAF6' saved to: /Users/chaoyu/bentoml/repository/IrisClassifier/20210319023551_84AAF6\n"
     ]
    }
   ],
   "source": [
    "# Save the prediction service to disk for deployment\n",
    "saved_path = iris_classifier_service.save()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "BentoML stores all packaged model files under the\n",
    "`~/bentoml/{service_name}/{service_version}` directory by default.\n",
    "The BentoML file format contains all the code, files, and configs required to \n",
    "deploy the model for serving.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## REST API Model Serving\n",
    "\n",
    "\n",
    "\n",
    "To start a REST API model server with the `IrisClassifier` saved above, use \n",
    "the `bentoml serve` command:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[2021-03-19 02:37:09,964] INFO - Getting latest version IrisClassifier:20210319023551_84AAF6\n",
      "[2021-03-19 02:37:09,965] INFO - Starting BentoML API server in development mode..\n",
      " * Serving Flask app \"IrisClassifier\" (lazy loading)\n",
      " * Environment: production\n",
      "\u001b[31m   WARNING: This is a development server. Do not use it in a production deployment.\u001b[0m\n",
      "\u001b[2m   Use a production WSGI server instead.\u001b[0m\n",
      " * Debug mode: off\n",
      " * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)\n",
      "127.0.0.1 - - [19/Mar/2021 02:37:19] \"\u001b[37mGET / HTTP/1.1\u001b[0m\" 200 -\n",
      "127.0.0.1 - - [19/Mar/2021 02:37:19] \"\u001b[37mGET /static_content/main.css HTTP/1.1\u001b[0m\" 200 -\n",
      "127.0.0.1 - - [19/Mar/2021 02:37:19] \"\u001b[37mGET /static_content/readme.css HTTP/1.1\u001b[0m\" 200 -\n",
      "127.0.0.1 - - [19/Mar/2021 02:37:19] \"\u001b[37mGET /static_content/swagger-ui.css HTTP/1.1\u001b[0m\" 200 -\n",
      "127.0.0.1 - - [19/Mar/2021 02:37:19] \"\u001b[37mGET /static_content/marked.min.js HTTP/1.1\u001b[0m\" 200 -\n",
      "127.0.0.1 - - [19/Mar/2021 02:37:19] \"\u001b[37mGET /static_content/swagger-ui-bundle.js HTTP/1.1\u001b[0m\" 200 -\n",
      "127.0.0.1 - - [19/Mar/2021 02:37:19] \"\u001b[37mGET /docs.json HTTP/1.1\u001b[0m\" 200 -\n",
      "127.0.0.1 - - [19/Mar/2021 02:37:20] \"\u001b[33mGET /favicon.ico HTTP/1.1\u001b[0m\" 404 -\n",
      "^C\n"
     ]
    }
   ],
   "source": [
    "!bentoml serve IrisClassifier:latest"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you are running this notebook from Google Colab, you can start the dev server with `--run-with-ngrok` option, to gain acccess to the API endpoint via a public endpoint managed by [ngrok](https://ngrok.com/): "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "scrolled": true
   },
   "outputs": [],
   "source": [
    "!bentoml serve IrisClassifier:latest --run-with-ngrok"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The `IrisClassifier` model is now served at `localhost:5000`. Use `curl` command to send\n",
    "a prediction request:\n",
    "\n",
    "```bash\n",
    "curl -i \\\n",
    "--header \"Content-Type: application/json\" \\\n",
    "--request POST \\\n",
    "--data '[[5.1, 3.5, 1.4, 0.2]]' \\\n",
    "localhost:5000/predict\n",
    "```\n",
    "\n",
    "Or with `python` and [request library](https://requests.readthedocs.io/):\n",
    "```python\n",
    "import requests\n",
    "response = requests.post(\"http://127.0.0.1:5000/predict\", json=[[5.1, 3.5, 1.4, 0.2]])\n",
    "print(response.text)\n",
    "```\n",
    "\n",
    "Note that BentoML API server automatically converts the Dataframe JSON format into a\n",
    "`pandas.DataFrame` object before sending it to the user-defined inference API function.\n",
    "\n",
    "The BentoML API server also provides a simple web UI dashboard.\n",
    "Go to http://localhost:5000 in the browser and use the Web UI to send\n",
    "prediction request:\n",
    "\n",
    "![BentoML API Server Web UI Screenshot](https://raw.githubusercontent.com/bentoml/BentoML/master/guides/quick-start/bento-api-server-web-ui.png)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Containerize model server with Docker\n",
    "\n",
    "\n",
    "\n",
    "One common way of distributing this model API server for production deployment, is via\n",
    "Docker containers. And BentoML provides a convenient way to do that.\n",
    "\n",
    "Note that `docker` is __not available in Google Colab__. You will need to download and run this notebook locally to try out this containerization with docker feature.\n",
    "\n",
    "If you already have docker configured, simply run the follow command to product a \n",
    "docker container serving the `IrisClassifier` prediction service created above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[2021-03-19 02:37:37,423] INFO - Getting latest version IrisClassifier:20210319023551_84AAF6\n",
      "\u001b[39mFound Bento: /Users/chaoyu/bentoml/repository/IrisClassifier/20210319023551_84AAF6\u001b[0m\n",
      "Containerizing IrisClassifier:20210319023551_84AAF6 with local YataiService and docker daemon from local environment-\u001b[32mBuild container image: iris-classifier:v1\u001b[0m\n",
      "\b \r"
     ]
    }
   ],
   "source": [
    "!bentoml containerize IrisClassifier:latest -t iris-classifier:v1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Start a container with the docker image built in the previous step:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[2021-03-19 09:37:42,276] INFO - Starting BentoML API server in production mode..\n",
      "[2021-03-19 09:37:42 +0000] [1] [INFO] Starting gunicorn 20.0.4\n",
      "[2021-03-19 09:37:42 +0000] [1] [INFO] Listening at: http://0.0.0.0:5000 (1)\n",
      "[2021-03-19 09:37:42 +0000] [1] [INFO] Using worker: sync\n",
      "[2021-03-19 09:37:42 +0000] [12] [INFO] Booting worker with pid: 12\n",
      "[2021-03-19 09:37:42 +0000] [13] [INFO] Booting worker with pid: 13\n",
      "[2021-03-19 09:37:42,700] WARNING - Saved BentoService Python version mismatch: loading BentoService bundle created with Python version 3.7.8, but current environment version is 3.7.6.\n",
      "[2021-03-19 09:37:42,771] WARNING - Saved BentoService Python version mismatch: loading BentoService bundle created with Python version 3.7.8, but current environment version is 3.7.6.\n",
      "[2021-03-19 09:38:35,545] INFO - {'service_name': 'IrisClassifier', 'service_version': '20210319023551_84AAF6', 'api': 'predict', 'task': {'data': '[[5.5, 2.3, 4.0, 1.3], [5.2, 4.1, 1.5, 0.1], [4.4, 2.9, 1.4, 0.2], [5.1, 3.8, 1.9, 0.4], [6.7, 3.3, 5.7, 2.1]]', 'task_id': '3c12e3f8-f7e0-47ed-a055-ed0e62623e6e', 'batch': 5, 'http_headers': (('Host', 'localhost:5000'), ('User-Agent', 'curl/7.71.1'), ('Accept', '*/*'), ('Content-Type', 'application/json'), ('Content-Length', '110'))}, 'result': {'data': '[1, 0, 0, 0, 2]', 'http_status': 200, 'http_headers': (('Content-Type', 'application/json'),)}, 'request_id': '3c12e3f8-f7e0-47ed-a055-ed0e62623e6e'}\n",
      "^C\n",
      "[2021-03-19 09:38:49 +0000] [1] [INFO] Handling signal: int\n",
      "[2021-03-19 09:38:49 +0000] [12] [INFO] Worker exiting (pid: 12)\n",
      "[2021-03-19 09:38:49 +0000] [13] [INFO] Worker exiting (pid: 13)\n"
     ]
    }
   ],
   "source": [
    "!docker run -p 5000:5000 iris-classifier:v1 --workers=2"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This made it possible to deploy BentoML bundled ML models with platforms such as\n",
    "[Kubeflow](https://www.kubeflow.org/docs/components/serving/bentoml/),\n",
    "[Knative](https://knative.dev/community/samples/serving/machinelearning-python-bentoml/),\n",
    "[Kubernetes](https://docs.bentoml.org/en/latest/deployment/kubernetes.html), which\n",
    "provides advanced model deployment features such as auto-scaling, A/B testing,\n",
    "scale-to-zero, canary rollout and multi-armed bandit.\n",
    "\n",
    "\n",
    "## Load saved BentoService\n",
    "\n",
    "`bentoml.load` is the API for loading a BentoML packaged model in python:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[2021-03-19 02:38:54,032] WARNING - Module `iris_classifier` already loaded, using existing imported module.\n",
      "[2021-03-19 02:38:54,071] WARNING - pip package requirement pandas already exist\n",
      "[2021-03-19 02:38:54,072] WARNING - pip package requirement scikit-learn already exist\n"
     ]
    },
    {
     "data": {
      "text/plain": [
       "memmap([1, 0, 0, 0, 2])"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "import bentoml\n",
    "import pandas as pd\n",
    "\n",
    "bento_svc = bentoml.load(saved_path)\n",
    "\n",
    "# Test loaded bentoml service:\n",
    "bento_svc.predict(test_input_df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The BentoML format is pip-installable and can be directly distributed as a\n",
    "PyPI package for using in python applications:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "\u001b[33m  WARNING: Built wheel for IrisClassifier is invalid: Metadata 1.2 mandates PEP 440 version, but '20210319023551-84AAF6' is not\u001b[0m\n",
      "\u001b[33m  DEPRECATION: IrisClassifier was installed using the legacy 'setup.py install' method, because a wheel could not be built for it. A possible replacement is to fix the wheel build issue reported above. You can find discussion regarding this at https://github.com/pypa/pip/issues/8368.\u001b[0m\n"
     ]
    }
   ],
   "source": [
    "!pip install -q {saved_path}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "memmap([1, 0, 0, 0, 2])"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# The BentoService class name will become packaged name\n",
    "import IrisClassifier\n",
    "\n",
    "installed_svc = IrisClassifier.load()\n",
    "installed_svc.predict(test_input_df)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This also allow users to upload their BentoService to pypi.org as public python package\n",
    "or to their organization's private PyPi index to share with other developers.\n",
    "\n",
    "`cd {saved_path} & python setup.py sdist upload`\n",
    "\n",
    "*You will have to configure \".pypirc\" file before uploading to pypi index.\n",
    "    You can find more information about distributing python package at:\n",
    "    https://docs.python.org/3.7/distributing/index.html#distributing-index*\n",
    "\n",
    "\n",
    "# Launch inference job from CLI"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "BentoML cli supports loading and running a packaged model from CLI. With the `DataframeInput` adapter, the CLI command supports reading input Dataframe data from CLI argument or local `csv` or `json` files:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1, 0, 0, 0, 2]\r\n"
     ]
    }
   ],
   "source": [
    "!bentoml run IrisClassifier:latest predict --input '{test_input_df.to_json()}' --quiet"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1, 0, 0, 0, 2]\r\n"
     ]
    }
   ],
   "source": [
    "!bentoml run IrisClassifier:latest predict \\\n",
    "    --input-file \"./test_input.csv\" --format \"csv\" --quiet"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[1, 0, 0, 0, 2]\r\n"
     ]
    }
   ],
   "source": [
    "# run inference with the docker image built above\n",
    "!docker run -v $(PWD):/tmp iris-classifier:v1 \\\n",
    "        bentoml run /bento predict --input-file \"/tmp/test_input.csv\" --format \"csv\" --quiet"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Deployment Options\n",
    "\n",
    "Check out the [BentoML deployment guide](https://docs.bentoml.org/en/latest/deployment/index.html)\n",
    "to better understand which deployment option is best suited for your use case.\n",
    "\n",
    "* One-click deployment with BentoML:\n",
    "  - [AWS Lambda](https://docs.bentoml.org/en/latest/deployment/aws_lambda.html)\n",
    "  - [AWS SageMaker](https://docs.bentoml.org/en/latest/deployment/aws_sagemaker.html)\n",
    "  - [AWS EC2](https://docs.bentoml.org/en/latest/deployment/aws_ec2.html)\n",
    "  - [Azure Functions](https://docs.bentoml.org/en/latest/deployment/azure_functions.html)\n",
    "\n",
    "* Deploy with open-source platforms:\n",
    "  - [Docker](https://docs.bentoml.org/en/latest/deployment/docker.html)\n",
    "  - [Kubernetes](https://docs.bentoml.org/en/latest/deployment/kubernetes.html)\n",
    "  - [Knative](https://docs.bentoml.org/en/latest/deployment/knative.html)\n",
    "  - [Kubeflow](https://docs.bentoml.org/en/latest/deployment/kubeflow.html)\n",
    "  - [KFServing](https://docs.bentoml.org/en/latest/deployment/kfserving.html)\n",
    "  - [Clipper](https://docs.bentoml.org/en/latest/deployment/clipper.html)\n",
    "\n",
    "* Manual cloud deployment guides:\n",
    "  - [AWS ECS](https://docs.bentoml.org/en/latest/deployment/aws_ecs.html)\n",
    "  - [Google Cloud Run](https://docs.bentoml.org/en/latest/deployment/google_cloud_run.html)\n",
    "  - [Azure container instance](https://docs.bentoml.org/en/latest/deployment/azure_container_instance.html)\n",
    "  - [Heroku](https://docs.bentoml.org/en/latest/deployment/heroku.html)\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Summary\n",
    "\n",
    "This is what it looks like when using BentoML to serve and deploy a model in the cloud. BentoML also supports [many other Machine Learning frameworks](https://docs.bentoml.org/en/latest/examples.html) besides Scikit-learn. The [BentoML core concepts](https://docs.bentoml.org/en/latest/concepts.html) doc is recommended for anyone looking to get a deeper understanding of BentoML.\n",
    "\n",
    "Join the [BentoML Slack](https://join.slack.com/t/bentoml/shared_invite/enQtNjcyMTY3MjE4NTgzLTU3ZDc1MWM5MzQxMWQxMzJiNTc1MTJmMzYzMTYwMjQ0OGEwNDFmZDkzYWQxNzgxYWNhNjAxZjk4MzI4OGY1Yjg) to follow the latest development updates and roadmap discussions."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
